Smart Paper Notes

Motion Style Transfer: Modular Low-Rank Adaptation for Deep Motion Forecasting

Parth Kothari , Danya Li , Yuejiang Liu , Alexandre Alahi

Categories: Computer Vision | Robotics

2022-11-06

Deep motion forecasting models have achieved great success when trained on a massive amount of data. Yet, they often perform poorly when training data is limited. To address this challenge, we propose a transfer learning approach for efficiently adapting pre-trained forecasting models to new domains, such as unseen agent types and scene contexts. Unlike the conventional fine-tuning approach that updates the whole encoder, our main idea is to reduce the amount of tunable parameters that can precisely account for the target domain-specific motion style. To this end, we introduce two components that exploit our prior knowledge of motion style shifts: (i) a low-rank motion style adapter that projects and adjusts the style features at a low-dimensional bottleneck; and (ii) a modular adapter strategy that disentangles the features of scene context and motion history to facilitate a fine-grained choice of adaptation layers. Through extensive experimentation, we show that our proposed adapter design, coined MoSA, outperforms prior methods on several forecasting benchmarks.
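The low-rank adapter idea in the abstract above can be illustrated with a minimal sketch. This is not the authors' MoSA implementation: the class name `LowRankAdapter`, the residual form `h + up(down(h))`, the zero initialization of the up-projection, and the dimensions are all assumptions chosen for illustration. The point it shows is that a rank-`r` bottleneck tunes far fewer parameters than a full `dim x dim` layer.

```python
import random


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


class LowRankAdapter:
    """Hypothetical residual adapter: h + up(down(h)) with a rank-r bottleneck."""

    def __init__(self, dim, rank, seed=0):
        rng = random.Random(seed)
        # Down-projection (rank x dim), small random values.
        self.down = [[rng.gauss(0.0, 0.01) for _ in range(dim)] for _ in range(rank)]
        # Up-projection (dim x rank), zeros, so the adapter starts as the identity
        # and the pre-trained features pass through unchanged before any tuning.
        self.up = [[0.0] * rank for _ in range(dim)]

    def num_parameters(self):
        """Count tunable weights in both projections."""
        return len(self.down) * len(self.down[0]) + len(self.up) * len(self.up[0])

    def __call__(self, features):
        bottleneck = matvec(self.down, features)  # project to r dimensions
        delta = matvec(self.up, bottleneck)       # project back to dim dimensions
        return [h + d for h, d in zip(features, delta)]


adapter = LowRankAdapter(dim=64, rank=4)
features = [1.0] * 64
adapted = adapter(features)
full_layer_parameters = 64 * 64
```

With these assumed sizes the adapter holds 512 tunable weights against 4096 for a full square layer, and its output equals its input until the up-projection is trained.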

translated by Google Translate

Safety-compliant Generative Adversarial Networks for Human Trajectory Forecasting

Parth Kothari , Alexandre Alahi

Categories: Computer Vision

2022-09-25

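Human trajectory forecasting in crowds presents the challenges of modelling social interactions and outputting collision-free multimodal distributions. Following the success of Social Generative Adversarial Networks (SGAN), recent works have proposed various GAN-based designs to better model human motion in crowds. Despite superior performance in reducing distance-based metrics, current networks fail to output socially acceptable trajectories, as evidenced by the high collision rates in model predictions. To counter this, we introduce SGANv2: an improved safety-compliant SGAN architecture equipped with spatio-temporal interaction modelling and a transformer-based discriminator. The spatio-temporal modelling ability helps to learn human social interactions better, while the transformer-based discriminator design improves temporal sequence modelling. Additionally, SGANv2 utilizes the learned discriminator even at test time via a collaborative sampling strategy that not only refines the colliding trajectories but also prevents mode collapse, a common phenomenon in GAN training. Through extensive experimentation on multiple real-world and synthetic datasets, we demonstrate the efficacy of SGANv2 in providing socially compliant multimodal trajectories.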


DriverGym: Democratising Reinforcement Learning for Autonomous Driving

Parth Kothari , Christian Perone , Luca Bergamini , Alexandre Alahi , Peter Ondruska

Categories: Machine Learning | Computer Vision

2021-11-12

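Despite promising progress in reinforcement learning (RL), developing algorithms for autonomous driving (AD) remains challenging: one of the critical issues is the absence of an open-source platform capable of training and effectively validating RL policies on real-world data. We propose DriverGym, an open-source OpenAI Gym-compatible environment tailored for developing RL algorithms for autonomous driving. DriverGym provides access to more than 1000 hours of expert logged data and supports reactive and data-driven agent behavior. The performance of an RL policy can be easily validated on real-world data using our extensive and flexible closed-loop evaluation protocol. In this work, we also provide behavior cloning baselines, using supervised learning and RL, trained in DriverGym. We make the DriverGym code, as well as all the baselines, publicly available to further stimulate development from the community.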
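As a rough illustration of what a Gym-compatible environment and a closed-loop evaluation look like, here is a toy sketch. It does not use or reproduce the DriverGym API: `ToyLaneEnv`, `evaluate_closed_loop`, the goal distance, and the reward are hypothetical, and only the classic Gym convention of `reset()` returning an observation and `step(action)` returning `(observation, reward, done, info)` is assumed.

```python
class ToyLaneEnv:
    """Hypothetical 1-D environment exposing the classic Gym reset/step interface."""

    def __init__(self, goal=5, max_steps=20):
        self.goal = goal
        self.max_steps = max_steps
        self.position = 0
        self.steps = 0

    def reset(self):
        """Start a new episode and return the initial observation."""
        self.position = 0
        self.steps = 0
        return self.position

    def step(self, action):
        """Apply an integer displacement (-1, 0, or +1) and advance one step."""
        self.position += action
        self.steps += 1
        reached = self.position >= self.goal
        done = reached or self.steps >= self.max_steps
        reward = 1.0 if reached else 0.0
        return self.position, reward, done, {}


def evaluate_closed_loop(env, policy):
    """Roll the policy out in the environment, feeding back its own observations."""
    obs = env.reset()
    total_reward, done = 0.0, False
    while not done:
        obs, reward, done, _ = env.step(policy(obs))
        total_reward += reward
    return total_reward, env.steps


# A trivial policy that always moves forward by one unit.
total_reward, steps = evaluate_closed_loop(ToyLaneEnv(), lambda obs: 1)
```

In a closed loop the policy's own actions determine the next observation, which is what distinguishes this kind of evaluation from scoring predictions against a fixed log.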


Privately Estimating a Gaussian: Efficient, Robust and Optimal

Daniel Alabi , Pravesh K. Kothari , Pranay Tankala , Prayaag Venkat , Fred Zhang

Categories: Machine Learning (Statistics)

2022-12-15

In this work, we give efficient algorithms for privately estimating a Gaussian distribution in both pure and approximate differential privacy (DP) models with optimal dependence on the dimension in the sample complexity. In the pure DP setting, we give an efficient algorithm that estimates an unknown $d$-dimensional Gaussian distribution up to an arbitrarily tiny total variation error using $\widetilde{O}(d^2 \log \kappa)$ samples while tolerating a constant fraction of adversarial outliers. Here, $\kappa$ is the condition number of the target covariance matrix. The sample bound matches the best non-private estimators in the dependence on the dimension (up to a polylogarithmic factor). We prove a new lower bound on differentially private covariance estimation to show that the dependence on the condition number $\kappa$ in the above sample bound is also tight. Prior to our work, only identifiability results (yielding inefficient super-polynomial time algorithms) were known for the problem. In the approximate DP setting, we give an efficient algorithm to estimate an unknown Gaussian distribution up to an arbitrarily tiny total variation error using $\widetilde{O}(d^2)$ samples while tolerating a constant fraction of adversarial outliers. Prior to our work, all efficient approximate DP algorithms incurred a super-quadratic sample cost or were not outlier-robust. For the special case of mean estimation, our algorithm achieves the optimal sample complexity of $\widetilde O(d)$, improving on a $\widetilde O(d^{1.5})$ bound from prior work. Our pure DP algorithm relies on a recursive private preconditioning subroutine that utilizes the recent work on private mean estimation [Hopkins et al., 2022]. Our approximate DP algorithms are based on a substantial upgrade of the method of stabilizing convex relaxations introduced in [Kothari et al., 2022].


Automated Deep Aberration Detection from Chromosome Karyotype Images

Zahra Shamsi , Drew Bryant , Jacob Wilson , Xiaoyu Qu , Avinava Dubey , Konik Kothari , Mostafa Dehghani , Mariya Chavarha , Valerii Likhosherstov , Brian Williams

Categories: Computer Vision | Machine Learning

2022-11-20

Chromosome analysis is essential for diagnosing genetic disorders. For hematologic malignancies, identification of somatic clonal aberrations by karyotype analysis remains the standard of care. However, karyotyping is costly and time-consuming because of the largely manual process and the expertise required in identifying and annotating aberrations. Efforts to automate karyotype analysis to date fell short in aberration detection. Using a training set of ~10k patient specimens and ~50k karyograms from over 5 years from the Fred Hutchinson Cancer Center, we created a labeled set of images representing individual chromosomes. These individual chromosomes were used to train and assess deep learning models for classifying the 24 human chromosomes and identifying chromosomal aberrations. The top-accuracy models utilized the recently introduced Topological Vision Transformers (TopViTs) with 2-level-block-Toeplitz masking, to incorporate structural inductive bias. TopViT outperformed CNN (Inception) models with >99.3% accuracy for chromosome identification, and exhibited accuracies >99% for aberration detection in most aberrations. Notably, we were able to show high-quality performance even in "few shot" learning scenarios. Incorporating the definition of clonality substantially improved both precision and recall (sensitivity). When applied to "zero shot" scenarios, the model captured aberrations without training, with perfect precision at >50% recall. Together these results show that modern deep learning models can approach expert-level performance for chromosome aberration detection. To our knowledge, this is the first study demonstrating the downstream effectiveness of TopViTs. These results open up exciting opportunities for not only expediting patient results but providing a scalable technology for early screening of low-abundance chromosomal lesions.


Deep Learning Driven Natural Languages Text to SQL Query Conversion: A Survey

Ayush Kumar , Parth Nagarkar , Prabhav Nalhe , Sanjeev Vijayakumar

Categories: Natural Language Processing | Artificial Intelligence

2022-08-08

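As the future moves toward data-centric decision-making, seamless access to databases is of utmost importance. There is extensive research on creating effective text-to-SQL (Text2SQL) models for accessing data in databases. Natural language is one of the best interfaces for bridging the gap between data and results by making database access efficient, especially for non-technical users. It will open the door for, and attract great interest from, users whether they are technically skilled or less proficient in query languages. Even though many deep learning-based algorithms have been proposed or studied, using natural language to solve data query problems in real-world scenarios remains very challenging. The reason is that different studies use different datasets, each of which comes with its own limitations and assumptions. At the same time, we lack a thorough understanding of these proposed models and of their limitations with respect to the specific datasets they are trained on. In this paper, we present a holistic overview of 24 neural network models studied over the last few years, including their architectures, which involve convolutional neural networks, recurrent neural networks, pointer networks, reinforcement learning, generative models, and more. We also give an overview of 11 datasets that are widely used to train models for Text2SQL technologies. Finally, we discuss future application possibilities of Text2SQL technologies for seamless data querying.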


Studying writer-suggestion interaction: A qualitative study to understand writer interaction with aligned/misaligned next-phrase suggestion

Advait Bhat , Saaket Agashe , Niharika Mohile , Parth Oberoi , Ravi Jangir , Anirudha Joshi

Categories: Artificial Intelligence

2022-08-01

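We present an exploratory qualitative study to understand how writers interact with next-phrase suggestions. While there has been some quantitative research on the effects of suggestion systems on writing, there has been little qualitative work to understand how writers interact with suggestion systems and how this affects their writing process, specifically for writers who are non-native English speakers. We conducted a study in which amateur writers were asked to write two movie reviews each, one of them without suggestions. We found that writers interact with next-phrase suggestions in various complex ways: writers are able to abstract multiple parts of a suggestion and incorporate them into their writing, even when they disagree with the suggestion as a whole. The suggestion system also had various effects on the writing process, contributing to different aspects of it in unique ways. We propose a model of writer-suggestion interaction for writing with GPT-2 on a movie review writing task, describe ways in which the model can be used in future research, and outline opportunities for research and design.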


Adaptive Fine-Grained Sketch-Based Image Retrieval

Ayan Kumar Bhunia , Aneeshan Sain , Parth Shah , Animesh Gupta , Pinaki Nath Chowdhury , Tao Xiang , Yi-Zhe Song

Categories: Computer Vision

2022-07-04

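The recent focus in Fine-Grained Sketch-Based Image Retrieval (FG-SBIR) has shifted towards generalising a model to new categories without any training data from them. In real-world applications, however, a trained FG-SBIR model is often applied to both new categories and different human sketchers, i.e., different drawing styles. Although this complicates the generalisation problem, fortunately a handful of examples are typically available, enabling the model to adapt to the new category/style. In this paper, we offer a novel perspective: instead of asking for a model that generalises, we advocate for one that adapts quickly, with only very few samples during testing (in a few-shot manner). To solve this new problem, we introduce a novel framework based on model-agnostic meta-learning (MAML) with several key modifications: (1) as this is a retrieval task with a margin-based contrastive loss, we simplify the MAML training in the inner loop to make it more stable and tractable; (2) the margin in our contrastive loss is also meta-learned with the rest of the model; (3) three additional regularisation losses are introduced in the outer loop to make the meta-learned FG-SBIR model more effective for category/style adaptation. Extensive experiments on public datasets show a large gain over generalisation and zero-shot based approaches, as well as over a few strong few-shot baselines.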
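To make the role of the margin concrete, here is a minimal sketch of one common margin-based loss for retrieval, a triplet-style hinge on Euclidean distances. The abstract does not specify the exact loss form, so the formula, the function names, and the toy embeddings are assumptions. In the paper the margin is meta-learned; here it is simply passed in as a number to show how it changes the loss.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two embedding vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def margin_contrastive_loss(sketch, positive, negative, margin):
    """Hinge loss: the matching photo should be closer than the non-matching one by `margin`."""
    return max(0.0, margin + euclidean(sketch, positive) - euclidean(sketch, negative))


# Toy 2-D embeddings (assumed values): the positive is at distance 5, the negative at 10.
sketch = [0.0, 0.0]
positive = [3.0, 4.0]
negative = [6.0, 8.0]

loss_small_margin = margin_contrastive_loss(sketch, positive, negative, margin=2.0)
loss_large_margin = margin_contrastive_loss(sketch, positive, negative, margin=7.0)
```

With the smaller margin the triplet is already satisfied and contributes no loss; with the larger margin the same triplet is penalized, which is why the choice of margin matters enough to be worth learning.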


List-Decodable Covariance Estimation

Misha Ivkov , Pravesh K. Kothari

Categories: Machine Learning | Machine Learning (Statistics)

2022-06-22

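We give the first polynomial-time algorithm for \emph{list-decodable covariance estimation}. For any $\alpha > 0$, our algorithm takes as input a sample $Y \subseteq \mathbb{R}^d$ of size $n \geq d^{\mathsf{poly}(1/\alpha)}$ obtained by adversarially corrupting $(1-\alpha)n$ points in an i.i.d. sample $X$ of size $n$ from the Gaussian distribution with unknown mean $\mu_*$ and covariance $\Sigma_*$. In $n^{\mathsf{poly}(1/\alpha)}$ time, it outputs a constant-size list of $k = k(\alpha) = (1/\alpha)^{\mathsf{poly}(1/\alpha)}$ candidate parameters that, with high probability, contains a $(\hat{\mu}, \hat{\Sigma})$ such that the total variation distance $TV(\mathcal{N}(\mu_*, \Sigma_*), \mathcal{N}(\hat{\mu}, \hat{\Sigma})) < 1 - O_{\alpha}(1)$. This is the statistically strongest notion of distance and implies multiplicative spectral and relative Frobenius distance approximation for the parameters with dimension-independent error. Our algorithm works more generally for $(1-\alpha)$-corruptions of any distribution $D$ that possesses low-degree sum-of-squares certificates of two natural analytic properties: 1) anti-concentration of one-dimensional marginals and 2) hypercontractivity of degree-2 polynomials. Prior to our work, the only known results for estimating covariance in the list-decodable setting were for special cases, due to Karmarkar, Klivans, and Kothari (2019), Raghavendra and Yau (2019 and 2020), and Bakshi and Kothari (2020). These results need super-polynomial time to obtain any subconstant error in the underlying dimension. Our result implies the first polynomial-time \emph{exact} algorithm for list-decodable linear regression and subspace recovery that allows, in particular, obtaining $2^{-\mathsf{poly}(d)}$ error in polynomial time. Our result also implies an improved algorithm for clustering non-spherical mixtures.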


Beyond the Imitation Game: Quantifying and extrapolating the capabilities of language models

Aarohi Srivastava , Abhinav Rastogi , Abhishek Rao , Abu Awal Md Shoeb , Abubakar Abid , Adam Fisch , Adam R. Brown , Adam Santoro , Aditya Gupta , Adrià Garriga-Alonso

Categories: Natural Language Processing | Artificial Intelligence | Machine Learning | Machine Learning (Statistics)

2022-06-09

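Language models demonstrate both quantitative improvement and new qualitative capabilities with increasing scale. Despite their potentially transformative impact, these new capabilities are as yet poorly characterized. In order to inform future research, prepare for disruptive new model capabilities, and ameliorate socially harmful effects, it is vital that we understand the present and near-future capabilities and limitations of language models. To address this challenge, we introduce the Beyond the Imitation Game benchmark (BIG-bench). BIG-bench currently consists of 204 tasks, contributed by 442 authors across 132 institutions. Task topics are diverse, drawing problems from linguistics, childhood development, math, common-sense reasoning, biology, physics, social bias, software development, and beyond. BIG-bench focuses on tasks that are believed to be beyond the capabilities of current language models. We evaluate the behavior of OpenAI's GPT models, Google-internal dense transformer architectures, and Switch-style sparse transformers on BIG-bench, across model sizes spanning millions to hundreds of billions of parameters. In addition, a team of human expert raters performed all tasks in order to provide a strong baseline. Findings include: model performance and calibration both improve with scale, but are poor in absolute terms (and when compared with rater performance); performance is remarkably similar across model classes, though with benefits from sparsity; tasks that improve gradually and predictably commonly involve a large knowledge or memorization component, whereas tasks that exhibit "breakthrough" behavior at a critical scale often involve multiple steps or components, or brittle metrics; social bias typically increases with scale in settings with ambiguous context, but this can be improved with prompting.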
